Smart Paper Notes

Towards Top-Down Deep Code Generation in Limited Scopes

Jian Gu, Harald C. Gall

Category: Machine Learning

2022-09-04

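Deep code generation is a topic in deep learning for software engineering (DL4SE) that uses neural models to generate code for intended functionality. Because end-to-end neural methods lack awareness of domain knowledge and software hierarchy, their results often require manual correction. To systematically explore potential improvements to code generation, we let it participate in the whole top-down development process from intention to realization, which is possible in limited scopes. In this process, it benefits from massive samples, features, and knowledge. As the foundation, we suggest building a taxonomy of code data, namely a code taxonomy, that leverages the categorization of code information. In addition, we introduce a three-layer semantic pyramid (SP) to associate text data and code data. It identifies information at different levels of abstraction, thereby introducing domain knowledge about development and revealing the hierarchy of software. Furthermore, we propose a semantic pyramid framework (SPF) as the approach, focusing on software of high modularity and low complexity. SPF divides the code generation process into stages and reserves spots for potential interactions. Finally, we conceive application scopes for SPF.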

Translated by Google Translate

Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal

Xinwei Liu, Jian Liu, Yang Bai, Jindong Gu, Tao Chen, Xiaojun Jia, Xiaochun Cao

Category: Computer Vision

2022-07-17

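As a common security tool, visible watermarking has been widely applied to protect the copyright of digital images. However, recent work has shown that visible watermarks can be removed by DNNs without damaging their host images. Such watermark-removal techniques pose a great threat to image ownership. Inspired by the vulnerability of DNNs to adversarial perturbations, we propose a novel defense mechanism based on adversarial machine learning. From the adversary's perspective, blind watermark-removal networks can be treated as our target models. We then optimize an imperceptible adversarial perturbation on the host image to proactively attack watermark-removal networks, which we call a Watermark Vaccine. Specifically, two types of vaccine are proposed. The Disrupting Watermark Vaccine (DWV) causes the host image to be ruined along with the watermark after it passes through a watermark-removal network. In contrast, the Inerasable Watermark Vaccine (IWV) works the other way, trying to keep the watermark unremoved and still noticeable. Extensive experiments demonstrate the effectiveness of our DWV/IWV in preventing watermark removal, especially across various watermark-removal networks.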

Towards Target Sequential Rules

Wensheng Gan, Gengsen Huang, Jian Weng, Tianlong Gu, Philip S. Yu

Category: Artificial Intelligence

2022-06-09

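In many real-world applications, sequential rule mining (SRM) can provide prediction and recommendation functions for a variety of services. It is an important pattern mining technique that discovers all valuable rules, that is, sequential rules with high frequency and high confidence. Although several SRM algorithms have been proposed to solve various practical problems, there are no studies on target sequential rules. Targeted sequential rule mining aims to mine the interesting sequential rules that users focus on, thereby avoiding the generation of other invalid and unnecessary rules. This approach can further improve users' efficiency in analyzing rules and reduce the consumption of data resources. In this paper, we provide the relevant definitions of target sequential rules and formulate the problem of targeted sequential rule mining. Furthermore, we propose an efficient algorithm called targeted sequential rule mining (TASRM). Several pruning strategies and an optimization are introduced to improve the efficiency of TASRM. Finally, a large number of experiments are conducted on different benchmarks, and we analyze the results in terms of running time, memory consumption, and scalability, as well as query cases with different query rules. The results show that the novel algorithm TASRM and its variants achieve better experimental performance than the existing baseline algorithm.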
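
To make "high frequency and high confidence" concrete, here is a minimal sketch of how the support and confidence of one sequential rule X -> Y could be computed. It assumes a common SRM definition (a rule occurs in a sequence when every item of X appears before every item of Y), which the abstract does not spell out. The function names, the toy database, and the `matches_target` filter are hypothetical illustrations, not the TASRM algorithm or its pruning strategies.

```python
def occurs(sequence, antecedent, consequent):
    """True if all antecedent items appear strictly before all consequent items.

    `sequence` is a list of itemsets (Python sets), ordered in time.
    """
    for k in range(1, len(sequence)):
        before = set().union(*sequence[:k])   # items seen in the first k itemsets
        after = set().union(*sequence[k:])    # items seen in the remaining itemsets
        if antecedent <= before and consequent <= after:
            return True
    return False


def contains(sequence, items):
    """True if every item occurs somewhere in the sequence."""
    return items <= set().union(*sequence)


def support_and_confidence(database, antecedent, consequent):
    """Support = share of sequences where the rule occurs;
    confidence = rule occurrences / sequences containing the antecedent."""
    rule_count = sum(occurs(s, antecedent, consequent) for s in database)
    antecedent_count = sum(contains(s, antecedent) for s in database)
    support = rule_count / len(database)
    confidence = rule_count / antecedent_count if antecedent_count else 0.0
    return support, confidence


def matches_target(antecedent, consequent, query_antecedent, query_consequent):
    """Assumed target test: the rule must include all queried items on each side."""
    return query_antecedent <= antecedent and query_consequent <= consequent


# Toy sequence database (made up for illustration).
database = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"c"}],
    [{"b"}, {"a"}],
    [{"a", "b"}, {"c"}],
]

support, confidence = support_and_confidence(database, {"a"}, {"c"})
print(support, confidence)
```

A targeted miner would evaluate only rules for which a test like `matches_target` holds, instead of enumerating every frequent rule and filtering afterwards.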

CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark

Yuan Yao, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, Fanchao Qi, Junwei Bao, Jinran Nie

Category: Natural Language Processing

2021-12-27

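Realizing general-purpose language intelligence has been a long-standing goal of natural language processing, in which standard evaluation benchmarks play a fundamental and guiding role. We argue that, for evaluating general-purpose language intelligence, the benchmark itself needs to be comprehensive and systematic. To this end, we propose CUGE, a Chinese Language Understanding and Generation Evaluation benchmark with the following features: (1) a hierarchical benchmark framework, in which datasets are selected and organized on principle according to a language capability-task-dataset hierarchy; (2) a multi-level scoring strategy, in which different levels of model performance are provided based on the hierarchical framework. To facilitate the use of CUGE, we provide a public leaderboard that can be customized to support flexible model judging criteria. Evaluation results on representative pre-trained language models indicate ample room for improvement towards general-purpose language intelligence. CUGE is publicly available at cuge.baai.ac.cn.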
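
The capability-task-dataset hierarchy and multi-level scoring can be pictured with a small sketch. It assumes that each level's score is the unweighted mean of the level below, which is an illustrative simplification and not necessarily the aggregation CUGE uses. All capability, task, and dataset names and all numbers are hypothetical.

```python
from statistics import mean

# Hypothetical per-dataset scores, nested as capability -> task -> dataset.
dataset_scores = {
    "understanding": {
        "task_a": {"dataset_1": 80.0, "dataset_2": 70.0},
        "task_b": {"dataset_3": 60.0},
    },
    "generation": {
        "task_c": {"dataset_4": 50.0},
    },
}


def multi_level_scores(hierarchy):
    """Return (task-level, capability-level, overall) scores.

    Assumption: every level is a plain average of its children.
    """
    task_level = {
        capability: {task: mean(datasets.values()) for task, datasets in tasks.items()}
        for capability, tasks in hierarchy.items()
    }
    capability_level = {
        capability: mean(tasks.values()) for capability, tasks in task_level.items()
    }
    overall = mean(capability_level.values())
    return task_level, capability_level, overall


task_level, capability_level, overall = multi_level_scores(dataset_scores)
print(task_level)
print(capability_level)
print(overall)
```

Reporting all three levels lets a reader see whether a model's overall score hides a weak capability or a single weak task.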

Multimodal Representation for Neural Code Search

Jian Gu, Zimin Chen, Martin Monperrus

Category: Machine Learning

2021-07-02

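Semantic code search is about finding semantically relevant code snippets for a given natural language query. In state-of-the-art approaches, the semantic similarity between code and query is quantified as the distance between their representations in a shared vector space. In this paper, to improve the vector space, we introduce tree-serialization methods on a simplified form of the AST and build a multimodal representation for code data. We conduct extensive experiments using a single corpus that is large-scale and multi-language: CodeSearchNet. Our results show that both our tree-serialized representations and our multimodal learning model improve the performance of code search. Finally, we define intuitive quantification metrics oriented towards the completeness of the semantic and syntactic information in code data, to help understand the experimental results.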
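
The idea of ranking code by distance in a shared vector space can be sketched as follows. The sketch assumes cosine similarity as the measure and uses hand-written three-dimensional vectors in place of learned embeddings. The snippet names and all values are made up, and the paper's encoders and tree serialization are not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_snippets(query_vec, snippet_vecs):
    """Sort snippets by similarity to the query, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec)) for name, vec in snippet_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy embeddings standing in for the outputs of a query encoder and a code encoder.
query = [0.9, 0.1, 0.0]
snippets = {
    "read_file": [0.8, 0.2, 0.1],
    "sort_list": [0.0, 0.1, 0.9],
    "open_socket": [0.3, 0.9, 0.1],
}

ranking = rank_snippets(query, snippets)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a real system the vectors would come from trained encoders, and a better code representation shows up as relevant snippets moving closer to their queries.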

Rethinking Mobile Block for Efficient Neural Models

Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang

Category: Computer Vision

2023-01-03

This paper focuses on designing efficient models with few parameters and low FLOPs for dense prediction. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of a Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods; e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1, surpassing SoTA CNN- and Transformer-based models while trading off model accuracy and efficiency well.

MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi, Kai Qiao, Jian Chen, Shuai Yang, Jie Yang, Baojie Song, Linyuan Wang, Bin Yan

Category: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.

A Comparative Study of Image Disguising Methods for Confidential Outsourced Learning

Sagar Sharma, Yuechun Gu, Keke Chen

Category: Machine Learning

2022-12-31

Large training data and expensive model tweaking are standard features of deep learning for images. As a result, data owners often use cloud resources to develop large-scale complex models, which raises privacy concerns. Existing solutions are either too expensive to be practical or do not sufficiently protect the confidentiality of data and models. In this paper, we study and compare novel image disguising mechanisms, DisguisedNets and InstaHide, aiming to achieve a better trade-off among the level of protection for outsourced DNN model training, the expenses, and the utility of data. DisguisedNets are novel combinations of image blocktization, block-level random permutation, and two block-level secure transformations: random multidimensional projection (RMT) and AES pixel-level encryption (AES). InstaHide is an image mixup and random pixel flipping technique [huang20]. We have analyzed and evaluated them under a multi-level threat model. RMT provides a better security guarantee than InstaHide under Level-1 adversarial knowledge, with well-preserved model quality. In contrast, AES provides a security guarantee under Level-2 adversarial knowledge, but it may affect model quality more. The unique features of image disguising also help us protect models from model-targeted attacks. We have carried out an extensive experimental evaluation to understand how these methods work in different settings for different datasets.
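
The first two ingredients named above, blocktization and block-level random permutation, can be illustrated with a short sketch. It assumes a grayscale image stored as a list of rows whose height and width are multiples of the block size. The function names and the integer key are hypothetical, and the RMT and AES block transformations are deliberately left out, so this is not the DisguisedNets scheme itself.

```python
import random


def split_into_blocks(image, block):
    """Cut the image into block x block tiles, in row-major order."""
    height, width = len(image), len(image[0])
    return [
        [row[c:c + block] for row in image[r:r + block]]
        for r in range(0, height, block)
        for c in range(0, width, block)
    ]


def merge_blocks(blocks, height, width, block):
    """Reassemble row-major tiles into a full image."""
    image = [[0] * width for _ in range(height)]
    per_row = width // block
    for index, tile in enumerate(blocks):
        r0, c0 = (index // per_row) * block, (index % per_row) * block
        for dr, tile_row in enumerate(tile):
            image[r0 + dr][c0:c0 + block] = tile_row
    return image


def permute_blocks(image, block, key):
    """Shuffle tile positions with a key-seeded RNG; returns (image, order).

    random.Random is NOT cryptographically secure; it is used only for illustration.
    """
    blocks = split_into_blocks(image, block)
    order = list(range(len(blocks)))
    random.Random(key).shuffle(order)
    shuffled = [blocks[j] for j in order]
    return merge_blocks(shuffled, len(image), len(image[0]), block), order


def unpermute_blocks(disguised, block, order):
    """Invert permute_blocks given the same order."""
    blocks = split_into_blocks(disguised, block)
    restored = [None] * len(blocks)
    for position, original_index in enumerate(order):
        restored[original_index] = blocks[position]
    return merge_blocks(restored, len(disguised), len(disguised[0]), block)


# 4x4 toy image with pixel values 0..15, split into four 2x2 blocks.
image = [[4 * r + c for c in range(4)] for r in range(4)]
disguised, order = permute_blocks(image, block=2, key=42)
recovered = unpermute_blocks(disguised, block=2, order=order)
print(disguised)
print(recovered == image)
```

Permutation alone only rearranges pixels, which is why the abstract pairs it with a per-block transformation such as RMT or AES.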

Translating Text Synopses to Video Storyboards

Xu Gu, Yuchong Sun, Feiyue Ni, Shizhe Chen, Ruihua Song, Boyuan Li, Xiang Cao

Category: Computer Vision

2022-12-31

A storyboard is a roadmap for video creation that consists of shot-by-shot images visualizing key plots in a text synopsis. Creating video storyboards, however, remains challenging: it not only requires association between high-level texts and images but also demands long-term reasoning to make transitions smooth across shots. In this paper, we propose a new task called Text synopsis to Video Storyboard (TeViS), which aims to retrieve an ordered sequence of images to visualize the text synopsis. We construct a MovieNet-TeViS benchmark based on the public MovieNet dataset. It contains 10K text synopses, each paired with keyframes that are manually selected from the corresponding movies by considering both relevance and cinematic coherence. We also present an encoder-decoder baseline for the task. The model uses a pretrained vision-and-language model to improve high-level text-image matching. To improve coherence across long-term shots, we further propose to pre-train the decoder on large-scale movie frames without text. Experimental results demonstrate that our proposed model significantly outperforms other models in creating text-relevant and coherent storyboards. Nevertheless, there is still a large gap compared to human performance, suggesting room for promising future work.

Pontryagin Optimal Controller via Neural Networks

Chengyang Gu, Yize Chen

Category: Machine Learning

2022-12-30

Solving real-world optimal control problems is challenging, as the system dynamics can be highly nonlinear or include nonconvex objectives and constraints, while in some cases the dynamics are unknown, making it hard to numerically solve for the optimal control actions. To deal with such modeling and computation challenges, in this paper we integrate neural networks with Pontryagin's Minimum Principle (PMP) and propose a computationally efficient framework, NN-PMP. The resulting controller can be implemented for systems with unknown and complex dynamics. It can not only use accurate surrogate models parameterized by neural networks, but also efficiently recover the optimality conditions along with the optimal action sequences via the PMP conditions. A toy example on a nonlinear Martian Base operation, along with a real-world lossy energy storage arbitrage example, demonstrates that our proposed NN-PMP is a general and versatile computation tool for finding optimal solutions. Compared with solutions provided by a numerical optimization solver with approximated linear dynamics, NN-PMP achieves more efficient system modeling and higher performance in terms of control objectives.
